A gene trap approach in mouse
embryonic stem cells: the lacZ reporter 
is activated by splicing, 
reflects endogenous gene expression, 
and is mutagenic in mice 
We have confirmed that the gene trap vector pGT4.5 creates spliced fusion transcripts with endogenous genes
and prevents the synthesis of normal transcripts at the site of integration. cDNA was prepared to the lacZ 
fusion transcript in three ES cell lines to recover endogenous exon sequences upstream of lacZ. Each of the 
clones detected a unique-sized endogenous transcript, as well as the fusion transcript in the ES cell line from 
which the clone was derived. Sequence analysis of these clones and larger clones isolated from a 
random-primed cDNA library showed that the splice acceptor was used properly. For two insertions, the 
expression patterns of the lacZ reporter and the associated endogenous gene were compared in situ at three 
embryonic stages and were found to be similar. Three gene trap insertions were transmitted into the germ 
line, and abnormalities were observed with two of the three insertions in the homozygous state. RNA
obtained from mice homozygous for the two mutant gene trap insertions was analyzed for normal endogenous 
transcripts and negligible amounts were detected, indicating that little splicing around the gene trap insertion 
occurred. This work demonstrates the capacity of the gene trap vector to generate lacZ fusion transcripts, to 
accurately report endogenous gene expression, and to mutate the endogenous gene at the site of integration. 

[Key Words: Gene trap vector; mouse mutants; embryonic stem cells; lacZ reporter; endogenous gene 
expression]
A strategy to monitor transcriptionally active regions of
the genome was first described in bacteria; it involves
the introduction of reporter constructs into the genome
that require the acquisition of cis-acting DNA sequences
to activate reporter gene expression (Casadaban and Cohen
1979). In this way, genes are identified based on expression
information and subsequently cloned from
DNA sequences flanking the site of insertion. This approach
has been applied more recently in higher organisms
using modified vectors suitable for eukaryotic transcription
units (for review, see Bellen et al. 1990; Skarnes
1990).

One type of vector, termed enhancer traps, was de- 
signed to capitalize on the observation that cellular en- 
hancers possess the ability to activate (or repress) tran- 
scription over a distance of several kilobase pairs (kbp) 
independent of their orientation with respect to a gene 
and are capable of acting upon heterologous promoters 
(Fried et al. 1983; Weber et al. 1984; Hamada 1986a,
1986b). This approach provided an efficient method to 
identify enhancers that activated or repressed transcrip- 
tion of a reporter containing a minimum promoter fol- 
lowing differentiation of cultured cell lines (Bhat et al. 
1988; Okamoto et al. 1990). 

In Drosophila, the enhancer trap strategy has been
used successfully in large-scale screens for genes expressed
at particular developmental stages or in particular
lineages (O'Kane and Gehring 1987; Fasano and Kerridge
1988; Fasano et al. 1988; Bellen et al. 1989; Bier et
al. 1989; Wilson et al. 1989; for review, see Bellen et al.
1990). Using P-element-based vectors, thousands of insertions
have been generated to monitor chromosomal
loci active during embryogenesis. The lacZ reporter provides
a sensitive and easily assayable gene product to
detect expression in whole embryos. Approximately
65% of the lines tested were found to express the lacZ
reporter in a restricted pattern during embryogenesis.
Furthermore, ~15% of the P-element insertions caused
recessive mutations that resulted in visible phenotypes.
In similar studies, one in five transgenic mice carrying
enhancer trap vectors were found to exhibit unique temporal
and spatial patterns of lacZ expression (Allen et al.
1988; Kothary et al. 1988). In one case, the integration
event resulted in a mutant phenotype (Kothary et al.
1988). To effectively use this strategy on a large scale in
mice, we introduced lacZ reporter constructs into mouse
embryonic stem (ES) cells (Gossler et al. 1989). Large
numbers of integration events can be rapidly generated
in ES cells, and the pattern of lacZ expression can be
easily assayed in ES cell-derived chimeric embryos.

The ability to prescreen ES cells for desired insertion
events also made possible the design of a second type of
vector that we have termed the gene trap (Gossler et al.
1989). Gene trap vectors were designed to generate
spliced fusion transcripts between the reporter gene and
the endogenous gene present at the site of integration
(Brenner et al. 1989; Gossler et al. 1989; Kerr et al. 1989;
Friedrich and Soriano 1991). The essential feature of the
gene trap design is the placement of a splice acceptor in
front of a promoterless lacZ gene. Integrations of the
vector into the intron of a gene in the correct orientation
were predicted to create lacZ fusion transcripts; and if
the reading frames of the endogenous gene and lacZ are
the same, an active β-galactosidase fusion protein should
be produced. Vectors similar to the original bacterial vectors,
termed promoter traps, have also been used (von
Melchner et al. 1990; Hope 1991; Reddy et al. 1991).
Because they contain only the coding sequences of the
reporter gene, they are expected to require insertions
into exons to activate reporter gene expression.

If lacZ expression faithfully mimics that of the endog- 
enous gene, monitoring the lacZ fusion gene activity in 
embryos should enable one to readily visualize the pat- 
tern of endogenous gene expression during development. 
The accuracy of lacZ expression associated with
enhancer/gene/promoter trap insertions has not been reported
in the mouse to date. In Drosophila, the majority
of enhancer trap insertions tested express β-galactosidase
in a pattern similar to the adjacent endogenous gene
(Fasano et al. 1988; Bellen et al. 1989; Bier et al. 1989;
Wilson et al. 1989). The requirement for gene/promoter
trap insertions to occur within transcription units may
well favor an accurate portrayal of endogenous gene
expression.

The introduction of exogenous DNA into the mouse 
germ line has the potential to generate mutations. Ap- 
proximately 5% of retrovirus and 10-20% of transgene 
insertions have been found to cause recessive pheno- 
types in the mouse (for review, see Jaenisch 1988). The 
mutated locus can be cloned from genomic DNA flank- 
ing the site of transgene insertion. However, the identi- 
fication of transcription units from flanking sequences 
requires considerable effort. Consequently, the molecular
characterization of genes associated with only one
transgene-induced mutation (Maas et al. 1990; Woychik
et al. 1990) and three retrovirus-induced mutations
(Schnieke et al. 1983; Gridley et al. 1990; Weiher et al.
1990) has been published to date.

Gene and promoter trap vectors may represent more
powerful insertional mutagens in the mouse than other
transgenes (Gossler et al. 1989; Friedrich and Soriano 
1991). This is predicted from the fact that these vectors 
can create lacZ fusion products with endogenous genes 
and, as a result, may interfere with the normal coding 
capacity of the endogenous gene and thereby create a 
mutation. Moreover, cloning a portion of the endoge- 
nous gene directly from the lacZ fusion transcript elim- 
inates the time-consuming task of searching for exons in 
flanking genomic DNA. 

Before using a gene trap approach in a large-scale 
screen to identify novel developmentally regulated
genes, it was necessary to validate that the gene trap 
vector was functioning as predicted. In this study we 
characterized four ES cell lines that carried independent 
integrations of a gene trap vector, pGT4.5, and expressed 
β-galactosidase in ES cells and during embryogenesis.
From three cell lines, a portion of the endogenous gene 
was cloned from the lacZ fusion transcripts; and in each 
case, the splice acceptor in the gene trap vector was used 
properly. The three endogenous genes identified were 
novel, one of which encodes a zinc finger-containing pro- 
tein. 

The fidelity with which the lacZ reporter gene reflects the
expression pattern of the endogenous gene was investigated
by comparing the endogenous transcript distribution
with the sites of β-galactosidase expression. To directly
test the potential of the gene trap insertions to
interrupt normal splicing of the endogenous gene, we
examined RNA from mice homozygous for two insertions.
Splicing around the gene trap insertion was estimated
to generate <0.1% of normal endogenous transcript
levels in homozygous tissues. Moreover, phenotypic
abnormalities were observed for two of three
insertions transmitted into the germ line. Our studies
with the pGT4.5 vector demonstrate the feasibility of a
gene trap approach to identify, mutate, and clone genes
important for normal mouse development and physiology.

Results 

In this study we concentrated on four pGT4.5 gene trap
integrations: Two (Gt2 and Gt4-2) were presumed to be
constitutively expressed, and two (Gt4-1 and Gt10) displayed
interesting patterns of expression between 8.5
and 12.5 days of development (Gossler et al. 1989). The
endogenous genes associated with the Gt4-1, Gt4-2, and
Gt10 insertions were cloned (see below), allowing a comparison
to be made of the distribution of the endogenous
transcripts and lacZ activity. The Gt2, Gt4-1, and Gt4-2
ES cell lines were transmitted into the germ line, and the
mutant phenotypes associated with these insertions
were assessed. lacZ expression for the Gt2, Gt4-1, and
Gt4-2 inserts was analyzed in these transgenic embryos.
For the analysis of the Gt10 lacZ expression, ES cell
chimeric embryos were generated.

The gene trap vector creates lacZ fusion transcripts 
with endogenous genes 

In our earlier studies, differences in the subcellular localization
of the β-galactosidase activity suggested that
the splice acceptor of the gene trap vector pGT4.5 was
used to produce unique lacZ fusion products (Gossler et
al. 1989). To examine whether each gene trap insertion
expressed a unique-sized lacZ fusion transcript, RNA
was purified from seven gene trap cell lines and analyzed
by Northern blot hybridization with a lacZ probe (Fig. 2,
below, and data not shown). The fusion transcripts varied
from 3.4 to 10 kb in size, although most (5 of 7) were
<4 kb. Because the lacZ-containing exon should contribute
3.3 kb, the majority of the fusion transcripts included
<0.7 kb of the endogenous gene.
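
The size arithmetic in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration, assuming the 3.3-kb vector contribution stated in the text; the helper name `endogenous_kb` is hypothetical, and the two sizes used are the reported extremes of the seven fusion transcripts.

```python
# Length (kb) contributed by the lacZ-containing exon, as stated in the text.
LACZ_EXON_KB = 3.3


def endogenous_kb(fusion_kb, vector_kb=LACZ_EXON_KB):
    """Estimate the endogenous sequence (kb) carried by a fusion transcript.

    Hypothetical helper: subtracts the vector-derived exon length from the
    fusion transcript size measured on the Northern blot.
    """
    return round(fusion_kb - vector_kb, 2)


# The smallest and largest fusion transcripts reported (3.4 and 10 kb).
for size in (3.4, 10.0):
    print(f"{size} kb fusion -> ~{endogenous_kb(size)} kb endogenous sequence")
```

Under this assumption, any fusion transcript of <4 kb carries <0.7 kb of endogenous sequence, and a 10-kb fusion carries roughly 7 kb.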

The rapid amplification of cDNA ends (RACE) protocol
(Frohman et al. 1988) was used to clone cDNA sequences
spanning the lacZ splice junction from three
cell lines. The amplified products contained 150 bp of
the En-2 sequences downstream of the splice junction.
cDNAs generated from RNA from the cell lines Gt4-1
and Gt10 were ~0.5 kb in length, the predicted length of
cDNA products that extend to the 5' end of the fusion
transcripts. In contrast, a heterogeneous smear of cDNA
was amplified from RNA from cell line Gt4-2, in keeping
with the estimated 7 kb of endogenous sequence present
in the fusion transcript. From each cell line, amplified
cDNA ~0.5 kb in size was size selected and cloned.

The gene trap splice acceptor is used properly 
to create novel β-galactosidase fusion products

At least three independent cDNA clones of different
sizes from each cell line were sequenced. Figure 1 shows
the nucleotide sequence of the En-2 splice acceptor used
in the gene trap vector and the sequence of the longest
amplified cDNA obtained for each cell line. In each cell
line, novel sequences were found upstream of the splice
junction, and none of the clones contained En-2 intron
sequences. As expected, because lacZ did not contain a
translation initiation codon, an open reading frame
(ORF) extended upstream of the splice junction that was
in-frame with lacZ.

The sequences of the Gt4-1 and Gt10 clones, which
contained most of the 5' ends of these genes, had equivalent
numbers of CpG and GpC dinucleotides at their 5'
ends, suggesting that these two genes are associated with
CpG islands (Bird 1986). The Gt4-1 sequence (Fig. 1B)
contained a single ORF with an ATG Kozak consensus
sequence (Kozak 1987). The 5' end of the Gt10 gene
showed evidence of alternative splicing and/or alternate
promoter use because two of the six clones contained
unique sequences that diverged 312 bp upstream of the
splice site and 66 bp upstream of a potential translation
initiation signal (Fig. 1D). A total of 0.4 kb of the
Gt4-2 fusion transcript was recovered, and it contained a
single ORF compatible with lacZ (Fig. 1C).
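
The dinucleotide comparison used above as evidence for CpG islands can be illustrated with a short Python sketch. It assumes only the general observation that CpG is depleted relative to GpC in bulk vertebrate DNA but not in CpG islands; the function names and the two example sequences are hypothetical and are not taken from Figure 1.

```python
def dinucleotide_count(seq, dinuc):
    """Count occurrences (overlaps allowed) of a dinucleotide in a DNA string."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


def cpg_to_gpc_ratio(seq):
    """Ratio of CpG to GpC dinucleotides; close to 1 is island-like."""
    cpg = dinucleotide_count(seq, "CG")
    gpc = dinucleotide_count(seq, "GC")
    return cpg / gpc if gpc else float("nan")


# Hypothetical sequences for illustration only (not from Fig. 1).
island_like = "GCGCGGCGCCGCGGGCGCGC"
bulk_like = "GCATGCTAGCAAGCTTGCAT"
print("island-like ratio:", cpg_to_gpc_ratio(island_like))
print("bulk-like ratio:", cpg_to_gpc_ratio(bulk_like))
```

A ratio near 1 over the 5' region of a cDNA would be consistent with the interpretation given in the text, whereas a ratio well below 1 would resemble bulk genomic DNA.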

To further characterize the three endogenous genes,
we attempted to obtain larger endogenous cDNA clones
that spanned the splice junction by screening a cDNA
library prepared from 12.5-day embryonic RNA (see Materials
and methods). Three overlapping Gt4-2 cDNA
clones that span a total of 5.5 kb and a single 2-kb Gt10
cDNA clone were recovered. The Gt10 cDNA began in
the exon sequences common to all forms of the amplified
Gt10 cDNA clones and extended 1.7 kb downstream
of the lacZ splice site. No cDNA clones for the Gt4-1
gene were recovered.

The nucleotide sequences downstream of the splice 
junction to lacZ were determined, and they showed that 
the reading frames identified in the original amplified 
cDNA clones continued (Fig. 1C,D). It therefore appears
that the Gt4-2 and Gt10 gene trap insertions have interrupted
the normal coding sequences of the associated
endogenous gene. Taken together, these results demon- 
strate that the splice acceptor site in the pGT4.5 vector 
is used correctly. 

Amplified cDNA clones detect both the endogenous 
and fusion transcripts in ES cells 

The amplified cDNA clones were used to determine the 
sizes of the normal uninterrupted endogenous transcripts
(Fig. 2; Table 1). Unique probes that hybridized to
single bands on genomic Southern blots (data not shown) 
were used for Northern blot analysis of total ES cell RNA 
obtained from each cell line (Fig. 2). ... 

The amplified Gt4-2 probe detected a ~10-kb endogenous
transcript that was similar in size to the Gt4-2 fusion
transcript (Fig. 2). A probe containing sequences
common to all of the Gt10 cDNA clones hybridized to a
major endogenous 2.9-kb transcript and a minor 2.3-kb
transcript in all ES cellular RNA and a 3.5-kb fusion
transcript in Gt10 RNA. The Gt4-1 probe detected an
8-kb transcript in the Gt4-2 and Gt10 ES cellular RNA
but apparently only the 3.4-kb fusion transcript in the
Gt4-1 line.

To investigate whether endogenous exons downstream
of the gene trap insertion were incorporated into
aberrant transcripts that were either spliced abnormally
or transcribed from the gene trap vector, the Northern
blot was reprobed with Gt10 and Gt4-2 exon probes 3' of
the lacZ splice site. Only the normal-sized endogenous
transcripts were detected (Fig. 2), and, as expected, these
sequences were not included in the lacZ fusion transcripts.

Comparison of lacZ fusion and endogenous gene 
expression during development 

A description of the lacZ expression pattern during development
was obtained from serially sectioned embryos
at selected stages and in newborn mice (see Materials
and methods). Table 1 summarizes the major sites of
β-galactosidase activity during embryogenesis for each
gene trap insertion. To examine the distribution of endogenous
transcripts, in situ hybridizations were performed
using RNA probes derived from endogenous
Gt4-2 and Gt10 cDNA clones. The short 0.2-kb Gt4-1
antisense probe did not provide conclusive results, probably
due to a lack of sensitivity because the lacZ fusion
product is expressed at very low levels.

Figure 1. Nucleotide and amino acid sequence
of the gene trap splice acceptor and
the endogenous genes associated with
three gene trap cell lines. (A) Sequence of
the En-2/lacZ junction in the pGT4.5 gene
trap vector. The En-2 splice acceptor region
is shown with intron sequences in
lowercase letters and exon sequences in
uppercase letters. The lacZ-coding sequence
lacked a translation initiation signal
and was fused in frame to En-2 with
linker sequences. The splice sites are
marked with a vertical arrow. The positions
of primers 170 and 256 used in the
RACE procedure are indicated by arrowheads.
Sequences of endogenous cDNAs
cloned from the Gt4-1 (B), Gt4-2 (C), and
Gt10 (D) cell lines are shown. Endogenous
cDNA sequences upstream (5') of the gene
trap splice acceptor in the lacZ fusion
transcript were obtained from cDNAs
cloned by the RACE procedure. The sequence
of endogenous exons downstream
(3') of the splice site was obtained from
cDNAs cloned from a 12.5-day embryonic
cDNA library (see text). The cluster of
three zinc finger motifs present in the
Gt4-2 endogenous gene is underlined and
is followed by a short acidic domain (in
bold). The primers used for amplifying
Gt4-2 cDNA in Fig. 6b are indicated by the
arrowheads. The sequence of the most
abundant form of the Gt10 transcript (four
of six clones) is shown, and the point at
which two of the six clones diverge is indicated
by an asterisk (*).

Gt10 gene expression. Staining of whole-mount 9.5-day
and sections of 10.5- and 12.5-day chimeric embryos generated
with the Gt10 ES cell line showed the highest
levels of β-galactosidase activity in the otic vesicle, the
dorsal and caudal aortae, the umbilical artery, and the
primitive gut (Fig. 3a,d,g). The developing neural tube
was devoid of staining. Cross sections of 12.5-day chimeras
showed high expression in the gut and intestinal
loops (Fig. 3g). Moderate levels of expression were found
in the skin, lung, and a dorsal-lateral region of mesenchymal
cells. Expression in the nervous system, liver,
and heart was not detectable at this stage.

The results obtained with the Gt10 probes showed
that the endogenous transcript had a similar distribution
to β-galactosidase activity (Fig. 3), except that the distribution
of the Gt10 endogenous transcript appeared to be
broader at 9.5 days than the lacZ expression. For example,
strong hybridization was observed in the pharyngeal
arches and limb buds, where only a few lacZ staining
cells were observed in Gt10 chimeric embryos. These
differences, however, may be the result of incomplete ES
cell contribution in the chimeric embryos. Although we
did not examine 15.5-day chimeric embryos for lacZ expression,
in situ hybridization experiments showed high
transcript levels in the intestine, bladder, dorsal aortae,
and other blood vessels (Fig. 3i). The pattern of expression
seen at 15.5 days was therefore similar to the sites of
expression seen at earlier stages.

Figure 2. lacZ fusion and endogenous transcripts associated
with three gene trap cell lines. Northern blot analysis of 5 µg of
total ES cellular RNA from the Gt4-1, Gt4-2, and Gt10 cell lines
hybridized with the probes indicated. The same blot was used
for each probe. Asterisks (*) show the position of the lacZ fusion
transcript identified with the lacZ probe. The normal endogenous
transcripts are indicated by arrows. Autoradiographs with
the Gt4-2 and Gt10 3' probes were overexposed to allow for
detection of rare abnormal transcripts.

Gt4-2 expression. From our preliminary analysis of
Gt4-2 expression in whole-mount chimeric embryos, we
reported that the Gt4-2 fusion gene product was expressed
constitutively between 8.5 and 12.5 days of gestation
(Gossler et al. 1989). A more detailed analysis of
sections of chimeric and transgenic embryos at these
stages showed that lacZ expression was absent in a subset
of tissues, notably, the yolk sac endoderm, the heart,
and the liver. Highest levels of expression were seen in
the developing peripheral and central nervous systems
(PNS and CNS, respectively) (Fig. 4a,c,e,g). At 15.5 days,
lacZ expression was restricted to almost all cells of the
PNS and CNS. The submandibular gland was one of the
few non-neuronal tissues that expressed the reporter
gene.

The distribution and relative levels of the endogenous
Gt4-2 transcript matched the sites of lacZ expression at
all stages examined (Fig. 4b,d,f,h). In 9.5- and 12.5-day
embryos, the Gt4-2 transcript was distributed widely
throughout the embryo, with the highest levels found in
the neural tube and little or no signal in the heart and
liver (Fig. 4b,d,f). At 15.5 days, expression was strong in
all regions of the brain, spinal cord, and dorsal root ganglia,
with lower levels in the submandibular gland (Fig.
4h).



Gene trap insertions can cause recessive abnormalities 

To test whether the pGT4.5 insertions caused phenotypic
abnormalities, we produced germ-line chimeric mice
from three ES cell lines: Gt2, Gt4-1, and Gt4-2 (data not
shown). The Gt10 cell line did not give rise to germ-line
chimeras. Agouti offspring from matings of germ-line
chimeras were genotyped by Southern blot analysis with
a lacZ probe, and, as expected, ~50% of the mice contained
the gene trap insertion (data not shown).

Offspring heterozygous for the gene trap insertions
were backcrossed one generation to allow random segregation
of any potential secondary mutations acquired
during ES cell culture. Wild-type and heterozygous animals
were born in the expected 1:1 ratio and reached
adulthood with each insertion (Table 2). Heterozygous
animals obtained from the backcross were then intercrossed
to determine the phenotype of homozygous animals.
The breeding results are summarized in Table 2.

Gt4-2 homozygotes die or are growth retarded. A restriction
fragment length polymorphism (RFLP) caused
by the Gt4-2 gene trap insertion and detected with a
Gt4-2 5' exon probe (Fig. 5) was used to determine the
genotype of Gt4-2 offspring. Roughly the expected
1:2:1 frequency of wild-type, heterozygous, and homozygous
pups was present in a total of 70 pups analyzed
at weaning on a mixed genetic background (Table
2). However, homozygous pups were visibly smaller than
their wild-type littermates of the same sex at weaning,
and differences in growth rates first became apparent at
1-2 weeks of age. Animals were weighed at 4-5 weeks
and at 8 weeks: Homozygous animals were 50-75% the
weight of their wild-type littermates at 4-5 weeks and
near normal (90%) body weight at 8 weeks (data not
shown). On the inbred 129/Sv genetic background (Table
2), fewer than the expected number of Gt4-2 homozygous
animals were obtained (1 of 31 live offspring), and 3
of the 24 heterozygotes were half the normal size. The
phenotypic consequences of the Gt4-2 defect therefore
seem to be more severe in the 129/Sv genetic background,
causing lethality in the mutant homozygous
state and a semidominant effect on growth in the heterozygous
state.
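
As a rough check on the statement that fewer homozygotes than expected were obtained on the 129/Sv background, the counts can be compared with a 1:2:1 ratio by a chi-square goodness-of-fit test. The sketch below is illustrative only: the wild-type count of 6 is inferred here as 31 - 24 - 1 and is not stated explicitly in the text, the function name is hypothetical, and 5.991 is the standard critical value for 2 degrees of freedom at the 0.05 level.

```python
def chi_square_gof(observed, ratio):
    """Chi-square goodness-of-fit statistic for counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value of chi-square for 2 degrees of freedom at alpha = 0.05.
CRITICAL_DF2_P05 = 5.991

# 129/Sv intercross: wild type (inferred, 31 - 24 - 1), heterozygous, homozygous.
observed_129sv = [6, 24, 1]
stat = chi_square_gof(observed_129sv, (1, 2, 1))
print(f"chi-square = {stat:.2f}; exceeds critical value: {stat > CRITICAL_DF2_P05}")
```

Under these assumptions the statistic exceeds the critical value, which is consistent with a deficit of homozygotes rather than a chance deviation from Mendelian segregation.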

Gt4-1 homozygotes die at birth. To give an indication
of the expression pattern of the endogenous Gt4-1 gene,
the pattern of lacZ expression was assessed. Transgenic
Gt4-1 embryos displayed widespread lacZ activity in
both extraembryonic and embryonic lineages from 6.5
days to birth. The level of expression varied in different
tissues but generally was low in the embryo. In the extraembryonic
lineages, particularly in the ectoplacental
cone at 8 days (Fig. 6a) and giant cells of 12.5-day placenta
(data not shown), expression was high. In the tissues
showing the lowest levels, not all of the cells of a
given cell type showed staining.

At 8.5 and 9.5 days, staining of whole-mount embryos
showed interesting spatial and temporal patterns of β-galactosidase
activity. Two stripes of more strongly expressing
cells were observed in the developing hindbrain



GENES & DEVELOPMENT 907

Skarnes et al.



Table 1. Summary of the characterization of four gene trap cell lines

Gene trap   Subcellular            Transcript size (kb)           Gene product           Expression pattern(c)                       Homozygous phenotype
insertion   localization(a)        lacZ fusion(b)   endogenous
Gt4-1       cytoplasm (dot)        3.6              8                                    widespread (placenta)                       perinatal lethal
Gt4-2       nuclear                10               10            zinc finger protein    restricted (pan-neural)                     growth retarded or lethal
Gt10        cytoplasm (vesicles)   3.6              2.9, 2.3      proline-rich protein   restricted (otocyst, bv(d), gut, bladder)   ND
Gt2         nuclear                ND               ND            ND                     widespread (neural, kidney, gut)            normal

(a) In partially differentiated ES cell monolayers (Gossler et al. 1989).
(b) Gene trap vector sequences contribute ~3.3 kb to the size of the fusion transcript.
(c) Tissues showing highest level of expression are shown in parentheses.
(d) Blood vessels.

of 8.5- to 9.5-day embryos (Fig. 6b). These stripes appeared
to demarcate rhombomeres 3 and 5, similar to
Krox-20 expression at this stage (Wilkinson et al. 1989).
At 9.5 days, stronger staining appeared in the otic vesicle,
and a dorsal/ventral gradient of expression was evident
with higher expression on the ventral side of the embryo
(data not shown). At 12.5 days and later stages, β-galactosidase
activity was found in almost all tissues but at
variable levels. The heart showed the highest expression.
The mesenchymal cells of the developing lung and gut
(Fig. 6c) expressed moderate levels of β-galactosidase activity,
whereas the epithelial components of the lung
and gut were devoid of expression.

The endogenous Gt4-1 cDNA clone did not detect
RFLPs caused by the gene trap insertion with 12 restriction
enzymes tested (data not shown). The mice were
therefore genotyped with an En-2 probe by comparing
the intensity of the vector band relative to the endogenous
En-2 bands (Fig. 5). In >150 pups analyzed at weaning
age from a heterozygous intercross, the expected 1:2
ratio between wild-type and heterozygous pups was
found, but there was a complete absence of homozygous
animals. Unexpectedly, we identified two viable homozygous
animals that survived to adulthood in 22 offspring
from a breeding pair that was obtained from a
CD-1 (outbred) backcross (Table 2). These findings suggest
that there is a strain-dependent penetrance of the
lethal phenotype.



We found that the intensity of lacZ staining provided
a convenient indicator of the genotype of embryos. In
litters examined at various stages of development,
roughly one-quarter of the embryos did not stain and
one-quarter stained more intensely than the remaining
embryos. Southern blot analysis confirmed that the most
darkly stained embryos were homozygous for the gene
trap insertion. Embryos homozygous for the Gt4-1 insertion
were found in litters up to the time of birth, and
from 17 days to birth these embryos displayed open eyelids.
Occasionally, live-born pups with open eyelids were
found, which died within a week. A histological examination
of serial sections through homozygous and wild-type
neonates revealed no obvious defects in the mutants.

The Gt2 insertion has no phenotypic effect. Although
no sequence data are available on the Gt2 endogenous
gene, the fusion product of the Gt2 cell line was found to
be chromatin associated (Gossler et al. 1989), suggesting
that the Gt2 gene product is a nuclear protein. lacZ
staining of Gt2 cells could be seen at the metaphase plate
of dividing cells, a phenomenon not observed with the
nuclear Gt4-2 fusion protein (data not shown). To give an
indication of the expression pattern of the endogenous
Gt2 gene, the pattern of lacZ expression was assessed.
Gt2 expression was observed in all tissues at the stages
examined (Fig. 6d-f). At 8.5 and 9.5 days, the Gt2 fusion
product was expressed in all cell types, with the highest
level in the developing neural tube (Fig. 6d and data not
shown). At 12.5 days and later stages, staining in the
forebrain diminished compared with the midbrain and
spinal cord (Fig. 6e). High levels of expression were observed
in the gut epithelium, in a subset of tubules in the
metanephros and mesonephros, and in condensing bone.

Only one transgenic animal was obtained from breeding
of the Gt2 germ-line chimera. The mice were genotyped
with the En-2 probe used for the analysis of Gt4-1
mice (Fig. 5). The genomic structure of the Gt2 insertion
in transgenic animals was found to differ from that of the
original ES cell line. In the transgenic mice, one of the vector
bands present in ES cells in at least two copies was
lost and an additional single-copy band was detected
(data not shown). The change in restriction pattern is
likely the result of a rearrangement that occurred in the
chimeric mouse. Fertile homozygous pups were present
and indistinguishable from their littermates at the expected
frequency in a total of 28 offspring (Table 2). Because
variable penetrance of the Gt4-1 and Gt4-2 homozygous
phenotype was observed in animals of different
genetic backgrounds, the original F1 Gt2 transgenic
animal was backcrossed three times to 129/Sv mice.
Heterozygotes were then intercrossed, and among a total
of 33 offspring no obvious deviation from the expected
Mendelian ratio was observed (Table 2).

The gene trap insertion prevents the synthesis
of normal endogenous transcripts

One explanation for the mild phenotypes observed relative
to the expression patterns of the three endogenous
genes is that the insertions caused leaky mutations owing
to splicing around the pGT4.5 vector. Ribonuclease
protection and reverse transcriptase-polymerase chain
reaction (RT-PCR) experiments were used as a sensitive
assay to determine whether normal endogenous transcripts
were synthesized. RNA was prepared from Gt4-1
and Gt4-2 ES cells and from wild type, heterozygous, and
homozygous Gt4-1 15.5-day embryos or Gt4-2 adult
brain. In RNA prepared from homozygous Gt4-1 and
Gt4-2 tissues, we did not detect appreciable levels of
normally spliced endogenous transcripts (Fig. 7a).

RT-PCR was used as a more sensitive assay of the
amount of normally spliced Gt4-2 transcript in homozygous
adult brain RNA relative to wild-type levels (Fig.
7b). First-strand cDNA was prepared from wild-type and
homozygous RNA, with a primer specific to Gt4-2 endogenous
sequences downstream of the splice site (see
Materials and methods). As an internal control, an En-2
primer specific for the En-2 and lacZ fusion transcripts
was also included in the reverse transcription reaction.
Tenfold serial dilutions of the reaction were then subjected
to 30 rounds of PCR, with primers designed to
amplify normal endogenous Gt4-2 cDNA spanning the
splice site, the lacZ fusion cDNA, and the En-2 transcript
spanning the splice site. The intensity of the Gt4-2
signal relative to the En-2 signal in the undiluted cDNA
sample from Gt4-2 homozygous RNA was less than the
signal obtained with the 10⁻³ dilution of wild-type
cDNA. The coamplified En-2 signal showed a similar
dilution profile for both samples, indicating that the input
RNA and amplification efficiency were similar in
both samples. This analysis indicates that <0.1% of the
normal transcript was present in brain RNA from homozygous
mice. Thus, in the adult brain, where the
Gt4-2 gene is expressed at high levels, negligible splicing
around the gene trap sequences occurred.
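The step from the dilution series to the <0.1% estimate can be written out as a short calculation. This is a minimal sketch, not the authors' procedure: the function name is hypothetical, and it assumes that band intensity scales with input template over the range compared and that both samples amplify with equal efficiency (which the En-2 control is intended to show).

```python
def upper_bound_fraction(mutant_dilution_exponent, wild_type_dilution_exponent):
    """Upper bound on the mutant/wild-type ratio of normal transcript.

    If the mutant cDNA diluted 10**-m gives a signal no stronger than the
    wild-type cDNA diluted 10**-w, then mutant * 10**-m <= wild_type * 10**-w,
    so mutant / wild_type <= 10**(m - w).
    """
    return 10.0 ** (mutant_dilution_exponent - wild_type_dilution_exponent)


# Undiluted homozygous cDNA (m = 0) compared with the 10^-3 wild-type dilution (w = 3).
bound = upper_bound_fraction(0, 3)
print(f"normal transcript in homozygotes is at most {bound:.1%} of wild type")
```

Because the homozygous signal was weaker than, not equal to, the 10⁻³ wild-type signal, the result is an upper bound rather than a point estimate.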

Discussion 

Our results confirm the prediction that the pGT4.5 gene
trap vector can generate lacZ fusion transcripts with en- 
dogenous genes through the use of the splice acceptor 
site contained in the vector. The RACE protocol pro- 
vided a direct method to clone cDNA sequences present 
upstream of lacZ in the fusion transcripts. The nucle- 
otide sequences of these and additional cDNA clones 
demonstrated that the splice acceptor in the vector was 
used to join novel sequences to the 5' end of lacZ (Fig. 1).
Furthermore, each cDNA probe containing sequences 5' 
of the lacZ splice site detected endogenous transcripts in 
ES cells, as well as a lacZ fusion transcript in the cell line 
from which the probe was derived (Fig. 2). This result 
provided definitive evidence that the endogenous gene 
interrupted by the gene trap insertion had been cloned. 

Integrations that activate lacZ may occur more often 
in large transcription units consisting of large introns. 
The relatively large size of the endogenous transcripts 
associated with the three gene trap integrations (3, 8, and
10 kb) may indicate such a preference. Southern blot
analyses with short endogenous cDNA probes upstream 
of the insertion site detected more than one fragment 
when DNA was digested with enzymes that do not cut 
within the probes (data not shown). This result indicates 
that the 5' cDNA clones are composed of more than one 
exon. Thus, the insertions do not appear to show a pref- 
erence for the first intron, as may be the case with ret- 
roviral insertions. The bias toward relatively short fu- 
sions may instead reflect the fact that the introns at the 
beginning of vertebrate genes tend to be much larger 
than introns farther 3' (Hawkins 1988). 

A search of the GenBank data base using both the nu- 

Table 2. Summary of transgenic breeding

                                                   3-4 w.o. pups
Breeding pair        Genetic strain (a)            +/+    +/-      -/-
Gt4-2/+ x +/+        (129 x C57)F1 x C57           48     43       NA
Gt4-2/+ x Gt4-2/+    (129 x C57)F1 x C57           26     29       16
Gt4-2/+ x Gt4-2/+    129                           7      24 (c)   2 (d)
Gt4-1/+ x +/+        (129 x C57)F1 x C57           47     62       NA
Gt4-1/+ x Gt4-1/+    (129 x C57)F1 x C57           47     108      0
Gt4-1/+ x Gt4-1/+    (129 x C57)F1 x CD1           11     9        2 (e)
Gt2/+ x +/+          (129 x CD1)F1 x CD1           51     61       NA
Gt2/+ x Gt2/+        (129 x CD1)F1 x CD1           7      15       6
Gt2/+ x Gt2/+        [(129 x CD1)F1 x 129]B3       9      16       8

Embryos (b) (+/+, +/-, -/-): 24, 48, 21

(NA) Not applicable.
(a) (C57) C57BL/6; (129) 129/Sv; (CD-1) outbred.
(b) Based on intensity of β-galactosidase staining and/or DNA analysis.
(c) Two heterozygotes that were half the normal weight died.
(d) One homozygote that was half the normal weight died.
(e) One male, one female; homozygous genotype confirmed in backcrosses to CD-1 females.



cleotide and putative protein-coding sequences indicated
that the endogenous genes associated with all three gene
trap lines were novel. The ORF of the Gt10 gene encodes
an amino-terminal proline-rich domain (Fig. 1D) but
shows <30% nucleotide sequence identity with other
proline-rich proteins in the data base. Three C2H2 zinc
finger-coding motifs and an acidic region were present
within the Gt4-2 endogenous gene immediately upstream
of the lacZ splice site (Fig. 1C). The zinc finger
motifs share only 30% amino acid identity and 40% nucleotide
sequence identity with other members of this
gene family. The identity of the Gt4-2 gene as a potential
transcription factor readily explains the nuclear localization
of the β-galactosidase fusion product (Gossler et al.
1989).

The activation of the lacZ reporter in the pGT4.5 gene
trap vector requires insertion into a transcription unit
and should be dependent on the enhancer, promoter, and
translational signals of the endogenous locus. For the
Gt4-2 and Gt10 insertions, the distribution and relative
abundance of the endogenous transcript matched well
the distribution of β-galactosidase activity (Figs. 3 and 4).
We expect that some integration events will alter the
normal regulatory mechanisms of the endogenous gene.
The Gt10 endogenous transcript appeared to be expressed
more broadly than lacZ in 9.5-day embryos.
These minor differences may reflect normal post-transcriptional
regulation of the endogenous gene product or,
as a consequence of the insertion, a perturbation in the
normal transcriptional and/or post-transcriptional control
signals. A direct comparison of either endogenous
and fusion transcript or protein levels will be required to
distinguish between normal regulatory mechanisms and
abnormal fusion gene expression. Nevertheless, our results
suggest that in most cases the lacZ expression pattern
will reflect that of the endogenous gene.

The unique patterns of lacZ expression observed with 
each gene trap insertion and the close correlation with 



endogenous gene expression for the two lines analyzed
imply that the pGT4.5 vector does not contain intrinsic
enhancer activity. lacZ expression was not found in
tissues devoid of endogenous transcripts, nor were obvi- 
ous patterns of lacZ expression found that were common 
to all insertion events. Therefore, sequences within the 
vector do not appear to be capable of activating lacZ 
expression outside of the normal domains of endogenous 
gene expression. 

lacZ staining provides a convenient, high-resolution 
assay to assess the overall pattern of endogenous gene 
expression. Furthermore, lacZ expression should aid in 

Figure 5. Southern blot analysis to determine the genotype of
transgenic mice. SstI (Gt2)- or BglII (Gt4-1)-digested DNA from
tails or embryos was hybridized to the E5 probe (see Materials
and methods). The E5 probe detects both transgene (asterisks)
and endogenous En-2 (En) fragments. The relative intensities of
the transgene fragments compared to the endogenous En-2 fragments
were used to determine the genotype of nontransgenic
(+/+), heterozygous (+/-), or homozygous (-/-) mutant animals.
The 19-A-1 probe (see Materials and methods), containing
endogenous Gt4-2 cDNA sequences, detected a new 2f>kb
PstI fragment (arrow) associated with the gene trap insertion.



GENES & DEVELOPMENT

Figure 6. [Caption only partly recoverable.] Gt4-1 and Gt2 ... in the ectoplacental cone ... where expression of lacZ was absent ... Gt2 kidney, showing high expression in the proximal tubules (arrowhead).



the phenotypic analysis of gene trap mutations. First,
β-galactosidase activity marks the tissues that normally
express the mutated gene. Defects in cells that express
lacZ would indicate a cell-autonomous gene function,
whereas defects in nonexpressing cells may suggest a
cell-nonautonomous role. Second, differences in the patterns
of expression in heterozygous versus homozygous
embryos may provide important clues regarding the underlying
defects caused by the mutation. Finally, lacZ
staining intensity can offer an easy way to genotype embryos.
A comparison of lacZ-stained embryos allowed us
to quickly pinpoint the stage at which the embryonic
lethal mutation associated with the Gt4-1 insertion occurred.
This should be of general applicability, particularly
for those mutations that fail during preimplantation
or early postimplantation development.

The phenotypes associated with the three gene trap



insertions are difficult to equate with the patterns of 
endogenous gene expression. Homozygous Gt4-1 mice 
die at birth and display an open eyelid phenotype. The 
perinatal lethal phenotype of the Gt4-1 mutation is fully 
penetrant in the C57BL/6 x 129/Sv hybrid background. 
However, viable homozygous mice were obtained in a 
complex genetic background that included both outbred 
and inbred chromosomes. The location of the Gt4-1 insertion
on chromosome 16 (N.A. Jenkins, D.J. Gilbert,
and N.G. Copeland, unpubl.) does not correlate with any
of the open eye mutants that have been mapped. lacZ
expression in Gt4-1 mice was widespread, with intriguing
stripes of comparatively high expression in rhombomeres
at 8.5 days and a dorsal-ventral gradient of expression
at 9.5 days. The highest level of expression was
found in the placenta. On the surface, the perinatal le- 
thal phenotype seems inconsistent with the pattern of 

Figure 7. The synthesis of normal endogenous transcripts is negligible in mice homozygous for two gene trap insertions. (A)
Ribonuclease protection analysis to detect the fusion and endogenous transcript levels in Gt4-1 and Gt4-2 ES cells and 15.5-day Gt4-1
embryos (left) and Gt4-2 adult brain RNA (right). The protected RNA species corresponding to the fusion (f) and the endogenous (e)
transcripts are indicated. (B) PCR amplification of normal Gt4-2 transcripts from wild-type and homozygous Gt4-2 adult brain RNA.
Southern blot is shown on the left of the PCR products amplified from serial 10-fold dilutions (0-5) of cDNA prepared from (+/+) and
(-/-) RNA and detected with the 4-2M probe (which detects the endogenous Gt4-2 transcript) and E5 (which detects the En-2 and
lacZ fusion transcripts) probes.



gene expression at these early stages; however, it is pos- 
sible that subtle abnormalities arose as a consequence of 
defects manifested early in development. 

Homozygous Gt4-2 mice on a C57BL/6 x 129/Sv 
background showed a mild growth retardation after 
birth. However, on an inbred 129/Sv background, fewer 
than the expected number of homozygous mutants sur- 
vived to weaning and some heterozygotes showed a 
growth defect. The Gt4-2 endogenous gene is expressed 
widely during gastrulation and specifically in the CNS 
and PNS later in development. We speculate that the 
growth deficiency and lethality may involve a neurolog- 
ical disorder and/or an endocrine defect. In any case, it 
seems likely that only a subset of cells expressing the 
gene are responsible for the mutant phenotype. The 
Gt4-2 gene product contains a zinc finger domain and 
maps to chromosome 4 (N.A. Jenkins, D.J. Gilbert, and
N.G. Copeland, unpubl.), proximal to the major urinary
protein (Mup-1). The recessive neurological mutants
vacillans (vc) and whirler (wi) map to this region (summarized
in Green 1989). Homozygous vc and wi animals
are smaller than normal and exhibit neurological dis- 
orders. The defect in Gt4-2 homozygous mice and the 
expression of the Gt4-2 gene in the CNS and PNS 
raise the possibility that the Gt4-2 mutation is an allele 
of vc or wi. 



Mice that were homozygous for the Gt2 mutation 
were viable and fertile. The nuclear fusion product was 
distributed widely throughout development. Sections 
through homozygous and wild-type neonates showed no
gross abnormalities despite high levels of expression in 
the kidney and gut. 

Despite the broad expression patterns of the endoge- 
nous genes during the early stages of development, none 
of the mutations caused early embryonic lethal phenotypes.
One explanation to account for this finding is that
the gene trap mutation does not completely abolish nor- 
mal gene activity. One possibility is that the insertions 
might create leaky mutations owing to partial splicing 
around the vectors. However, we have shown that neg- 
ligible amounts of the normal transcripts are present in 
RNA obtained from homozygous mutant Gt4-1 and 
Gt4-2 tissues. Furthermore, the insertions do not induce 
the synthesis of aberrant transcripts other than the fu- 
sion transcripts. Therefore, our results indicate that the 
splice acceptor and poly(A) signals of the pGT4.5 vector
act efficiently to define the final exon of the fusion
transcript and prevent splicing events around the insertion.
Other gene trap vectors, depending on the individual
properties of the splice acceptor and poly(A) signal,
may not yield the same results. 

The fact that some insertions will generate sizable 
β-galactosidase fusion proteins raises the possibility that
these fusion products, in some cases, will have partial
activity. Alternatively, they may behave as dominant-negative
molecules. In this respect, it is interesting that
the Gt4-2 mutation may cause a semidominant phenotype,
as a growth defect was observed in some heterozygous
animals. One interpretation is that the Gt4-2 zinc
finger-containing fusion protein has a dominant-negative
activity. The more severe phenotype observed in
homozygous animals may be a result of the higher levels
of fusion protein activity found in these animals. It is
also possible that the semidominant phenotype is caused
by a dosage effect owing to the loss of gene activity. As
with all mutagenesis screens, the severity and nature of
individual gene trap mutations may not be immediately
obvious and must be examined carefully. The potential
exists for pGT4.5 insertions to create dominant-negative,
partial loss-of-function, and gain-of-function mutations,
and not just null mutations. This may be the case
for other gene trap vectors as well as for promoter traps. 

Mutagenesis screens in a variety of organisms, including
the mouse, suggest that the vast majority of genes in
the genome are apparently dispensable (Dove 1987). In
yeast, where more than half the genome is transcribed,
70% of insertional mutations generated in randomly selected
regions of genomic DNA have no detectable phenotype
(Goebl and Petes 1986). Similarly, based on a
screen for chemically induced recessive phenotypes in
the t locus of the mouse, it was estimated that 50-100
essential genes reside within this region that represents
1% of the mouse genome (Shedlovsky et al. 1988). This
number is an order of magnitude smaller than the number
of genes predicted to occupy this region (Ohno 1986).
Recently, 24 lines of mice that contain insertions of a
retrovirus-based gene trap vector have been generated
and 9 were found to cause recessive embryonic lethal
phenotypes (Friedrich and Soriano 1991). We have found
one embryonic lethal mutation (Gt4-1) in three lines
tested, and a second mutation (Gt4-2) appears to be lethal
in inbred mice. Taken together, these results are in
line with the prediction that most single gene lesions are
not expected to produce obvious phenotypes. Because 
insertions that likely interrupt an endogenous gene can 
be preselected in ES cells, it is possible with this ap- 
proach to recover mutations that are phenotypically si- 
lent. 

The gene trap approach we describe here combines an 
effective DNA mutagen with a strategy, similar to the 
enhancer trap, to identify genes based on patterns of gene 
expression. This strategy offers a powerful tool for cata- 
loging the expression patterns of genes active during 
mouse embryogenesis and for creating insertional muta- 
tions that are immediately accessible to molecular char- 
acterization. 

Materials and methods 

DNA and RNA probes 

For the Northern and Southern blot analyses, the endogenous
probes that contained sequences 5' of the lacZ splice site were
subcloned from the PCR-amplified cDNA products into
pGEM-1 (Promega). The clones consisted of the following nucleotides
(as shown in Fig. 1): Gt4-1 5' (bp 72^08), Gt4-2 5' (bp
107-302), Gt10 5' (bp 149-367). Probes that include endogenous
sequences spanning the splice site, Gt4-2 3' (0.6-kbp EcoRI),
19-A-1 (a 1.1-kb NcoI-KpnI Gt4-2 fragment), and Gt10 3' (0.5-kbp
XhoI-SstI), were cloned from cDNAs obtained in the screen
of a 12.5-day cDNA library. The En-2 probe used for genotyping
transgenics (E5) was a 0.4-kb BamHI-SstI fragment that includes
0.25 kb of intron and 0.15 kb of exon sequence spanning
the En-2 splice acceptor. The antisense RNA probes used in the
RNase protection were cloned into pGEM-7Zf (Promega) and
contained the following amplified cDNA sequences (3' to 5')
spanning the lacZ splice site: Gt4-1a, 130 bp of the En-2 exon
from the KpnI site to the splice site (Fig. 1A), 225 bp of Gt4-1
cDNA upstream of the splice site, and 60 bp of the pGEM-7Zf
polylinker; Gt4-2a, the same 130 bp of En-2 exon, 335 bp of
Gt4-2 cDNA, and 60 bp of the pGEM-7Zf polylinker. The 4-2M
probe used in the PCR amplification experiment contained 145
bp immediately 3' of the En-2 splice site.

RNA and DNA blot analyses 

Total RNA was prepared by lysis of ES cells or tissues in guanidinium
isothiocyanate followed by centrifugation through a cesium
chloride cushion (Sambrook et al. 1989). RNA pellets were
resuspended in 8 M urea, extracted several times with phenol-chloroform
(1 : 1), and stored as an ethanol precipitate. Genomic
DNA was purified from proteinase K-digested ES cell pellets
and tail biopsies by phenol-chloroform (1 : 1) extraction.
Northern and Southern blots using GeneScreen (Dupont) were
hybridized with random-primed DNA probes as described (Joyner
et al. 1985).

Cloning and sequencing endogenous cDNAs 
The RACE strategy (Frohman et al. 1988) with several modifications
was used to amplify cDNA sequences 5' to lacZ. Five
micrograms of total ES cell RNA was annealed with 10 ng of
primer 170 (Fig. 1A), and first-strand cDNA was synthesized
with reverse transcriptase as described previously (Frohman et
al. 1988). Prior to A-tailing, RNA in the reverse transcription
reaction was hydrolyzed in alkali (0.2 N NaOH for 1 hr at 65°C).
The single-stranded cDNA was then purified on a NACS column
(Bethesda Research Laboratories). A-tailing of the cDNA
with terminal deoxytransferase (Boehringer Mannheim) was
carried out at 37°C for 7.5 min in the presence of 0.2 mM dATP
as described previously (Frohman et al. 1988). The tailed products
were extracted once with phenol and once with phenol-chloroform
(1 : 1) and precipitated in ethanol. The Klenow enzyme
(Pharmacia) was used to synthesize second-strand cDNA
in buffer containing 10 ng of primer 210T (5'-GCTTCTGTCGACTATCGATGGGTTTTTTTTTTTTTTTTT-3').
The reaction was incubated at room temperature for 30 min and then at
37°C for 30 min. The double-stranded cDNA products were
then subjected to 40 rounds of PCR (94°C for 90 sec; 65°C for 2
min; 72°C for 15 min) using 3 units of Taq polymerase in 1x
Taq buffer (Perkin-Elmer Cetus) containing 1 µg each of primer
2 1 0 1 5 ' -OGTTCTGTCG ACT ATCGATGG C - 3 ' ) and pnmer 256 
(Fig. 1A). Fresh dNTPs and 1.5 units of Taq polymerase were
added into the final round of PCR.

The size range of the amplified products was visualized by
Southern blot analysis, using the En-2 E5 probe. Size-selected
0.5-kb cDNA was digested with the restriction enzymes SalI
and ClaI and cloned into pUC18. Mini-prep plasmid DNA was
sequenced with the Sequenase kit (U.S. Biochemical). To obtain
sequence from both strands, exonuclease III deletions were generated
(Henikoff 1987) or specific primers were synthesized.



cDNA library screen

A λgt10 random-primed cDNA library, kindly provided by Mark
Hanks (Mount Sinai Hospital, Toronto), was constructed using
an Amersham kit from poly(A)+-selected 12.5-day embryonic
CEM mouse RNA. The library contained 2 x 10* inserts and
was amplified once. Screens of 2 x 10^6 phage with the Gt4-1 5'
probe, 1 x 10^6 phage with the Gt4-2 5' probe, and 0.5 x 10^6
phage with the Gt10 5' probe yielded 0, 3, and 1 clone, respectively,
with unique-sized cDNA inserts. The inserts were subcloned
into pGEM-7Zf and partially sequenced to confirm their
identity.



RNase protection assay

32P-labeled antisense runoff transcripts were synthesized with
SP6 or T7 polymerase using the Riboprobe kit (Promega). The
reaction was treated with DNase I, and the probes were purified
on 5% denaturing polyacrylamide sequencing gels. Two to five
micrograms of total RNA was hybridized to 10^5 cpm of gel-purified
antisense probe and digested with RNase A and T1 as
described previously (Melton et al. 1984). The protected RNA
species were separated using denaturing polyacrylamide gel
electrophoresis and visualized by autoradiography.



PCR amplification of Gt4-2 transcripts 

In a volume of 20 µl, 10 ng of primer 302 (Fig. 1C) and 10 ng of
primer 256 were used to prepare Gt4-2- and En-2-specific
cDNAs from 2 µg of total RNA. Serial 10-fold dilutions of the
cDNA products were subjected to 30 rounds of PCR in 50 µl of
buffer containing 1 µg each of primer 301 (Fig. 1C), 302, 256, and
an En-2 5' primer, SNT6. The 301/302 primers amplify the
normal endogenous Gt4-2 transcript to produce a 480-bp cDNA
fragment; the 302/256 primer combination amplifies a 450-bp
cDNA fragment of the Gt4-2/lacZ fusion transcript; and the
SNT6/256 primers amplify the endogenous En-2 transcript to
produce a 370-bp cDNA fragment containing 240 bp of exon 1
and 130 bp of exon 2.
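The three expected product sizes make it possible to tell the bands apart on a gel. The sketch below is only an illustration of that bookkeeping: the product sizes are those stated above, while the dictionary layout, the function name, and the 10-bp sizing tolerance are hypothetical.

```python
# Expected PCR product sizes (bp) as stated in the text.
EXPECTED_PRODUCTS_BP = {
    "endogenous Gt4-2": 480,
    "Gt4-2/lacZ fusion": 450,
    "endogenous En-2": 370,
}

# The En-2 product is described as 240 bp of exon 1 plus 130 bp of exon 2.
EN2_EXON1_BP = 240
EN2_EXON2_BP = 130


def assign_band(observed_bp, tolerance_bp=10):
    """Return the transcript whose expected product size matches the band.

    The tolerance is an assumed gel-sizing error, not a value from the text.
    Returns None when no product, or more than one, falls within tolerance.
    """
    matches = [
        name
        for name, size in EXPECTED_PRODUCTS_BP.items()
        if abs(observed_bp - size) <= tolerance_bp
    ]
    return matches[0] if len(matches) == 1 else None


# Consistency check on the stated En-2 fragment composition.
print(EN2_EXON1_BP + EN2_EXON2_BP == EXPECTED_PRODUCTS_BP["endogenous En-2"])
print(assign_band(452))
```

The 30-bp difference between the endogenous Gt4-2 and fusion products is the smallest gap, so any tolerance below 15 bp keeps the assignment unambiguous.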

Production of ES cell chimeras 

C57BL/6 blastocysts were injected with 10-15 ES cells (D3) as
described previously (Gossler et al. 1989). In one experiment
with the Gt2 cell line, CD-1 blastocysts after cell injection were
delayed for 5 days in ovariectomized female recipients and then
transferred to a new pseudopregnant female. Chimeric males,
identified by coat color, were mated to either C57BL/6J (Gt4-1,
Gt4-2, and Gt10 lines), or CD-1 (Gt2) or 129/Sv (Gt4-2) females.



lacZ staining of whole mount and embryo sections 

Whole-mount embryos up to 12.5 days were fixed in 0.2% glutaraldehyde
(Sigma) and stained with X-gal (Bethesda Research
Laboratories) as described previously (Beddington et al. 1989).
Beyond 12.5 days, we found that staining of whole-mount embryos
was ineffective owing to poor penetration of the stain.
Later stage embryos were rinsed in PBS and then frozen slowly
to -70°C in O.C.T. embedding medium (Tissue-Tek, Miles,
Inc.). The embryos were stored frozen for up to 8 weeks prior to
cryostat sectioning. Frozen sections were fixed in 0.2% glutaraldehyde,
2% formaldehyde, for 5 min, rinsed, and stained.

In situ hybridization 

cDNA fragments were cloned into pGEM-7Zf (Promega). The
two Gt4-2 probes used were Gt4-2 5' and 19-A-1. The Gt10S
(sense) and Gt10X (antisense) probes contained a 1.1-kbp XhoI-SstI
subfragment of the cDNA clone that spanned the splice site
of the Gt10 endogenous cDNA. Each probe was nonrepetitive
by Southern blot analysis of genomic DNA and detected a single
major transcript by Northern analysis of ES cell RNA.

Uniformly 35S-labeled RNA probes were synthesized
with SP6 or T7 polymerase using a Riboprobe kit (Promega).
CD-1 embryos were cryostat sectioned at 7 µm thickness and
put onto aminopropyltriethoxysilane-coated slides, fixed in
20% paraformaldehyde, dehydrated, and stored at -70°C. The
slides were processed, hybridized under Gelbond coverslips
(FMC), and washed as described previously (Hogan et al. 1986).
The 2x SSC and 0.1x SSC washes were carried out at 50°C. The
slides were then dehydrated and dipped in Kodak NTB-2 emulsion.
Exposure times were between 1 and 4 days. At least two
separate in situ experiments were carried out with each probe at
each stage. Multiple 9.5-day embryos and at least two 12.5- and
15.5-day embryos were analyzed. Both sagittal sections and
transverse sections through the head and trunk of 12.5- and
15.5-day embryos were analyzed.



Note added in proof 

Sequence data described in this paper have been submitted to 
the EMBL/GenBank data libraries. 